Chris Pollett > Old Classes >
CS152

( Print View )

Student Corner:
  [Grades Sec1]

  [Submit Sec1]

  [Class Sign Up Sec1]

  [
Lecture Notes]
  [Discussion Board]

Course Info:
  [Texts & Links]
  [Topics/Outcomes]
  [Outcomes Matrix]
  [Grading]
  [HW/Quiz Info]
  [Exam Info]
  [Regrades]
  [Honesty]
  [Additional Policies]
  [Announcements]

HWs and Quizzes:
  [Hw1]  [Hw2]  [Hw3]
  [Hw4]  [Hw5]  [Quizzes]

Practice Exams:
  [Mid]  [Final]

                           












HW#2 --- last modified February 28 2019 22:25:29..

Solution set.

Due date: Mar 11

Files to be submitted:
  Hw2.zip

Purpose: To learn about Lex (Flex) and Yacc (Bison) and in the process create an interpreter for the command line of a simple database system.

Related Course Outcomes:

The main course outcomes covered by this assignment are:

LO5 -- Read and produce context-free grammars.

LO6 -- Write recursive-descent parsers for simple languages, by hand or with a parser generator.

LO8 -- Write interpreters for simple languages that involve arithmetic expressions, bindings of values to names, and function calls.

Specification:

For this homework you will create a very simple database management system called YoctoDB. In many real world situations it is useful to have a database system that supports both a command line interface and is tiny enough that it can also be embedded in other application which make use of it only through a C API. For instance, Firefox, the web-browser, makes use of SQLite. There is also the very classical Berkeley DB API. The O'Reilly book for Lex and Yacc contains a parser for the SQL language, so we will develop our own language for YoctoDB.

To build your project, the grader will unzip your Hw2.zip and switch into that directory. The grader will then do a "make all". The all target should depend on two subtargets: commandline and apitest. The commandline target should build a commandline interface program for YoctoDB. This should be runnable by the grader with a line like:

./yoctodb

This program should then display the yoctodb prompt:

yoctodb>

At which point the grader can enter a YoctoDB command, see its output, get a new prompt, enter a new command, see its output and continue doing this for as long as desired.

The apitest target should be runnable by the grader by typing a line like:

./apitest

The program should test the functions of your C API to YoctoDB. To use the C API, a programmer should only need to include the header yoctodb.h . Your api can have as many functions as you like; however, it must support at least the following functions:

int yoctoInit(int *instance);

int yoctoLock(int instance, int *lock);

int yoctoUnlock(int instance, int *lock);

int yoctoExec(int instance, int lock, char *cmd, int *responseSize);

int yoctoResult(int instance, int lock, char *result, int numBytes);

int yoctoDestroy(int instance);

Each of the functions above return 0 if they worked successfully and -1 otherwise. In order to understand what these functions do, we first briefly describe what a database is. For us, a database instance will be a collection of tables. A table is a collection of rows. Each row, for us, will be a tuple of strings. For example one might have the table $person(fname, lname, age) and an example might be ("Chris", "Pollett", "21+"). The goal of a database management system such as YoctoDB is to make it easy to manage databases.

We want the YoctoDB to be fast and so we want as much as possible to keep things in memory rather than on disk. We might use a yoctoExec call to read in a database into memory from disk, but thereafter, if we execute further yoctoExec command on this database, we don't want to have to re-read it in. So we use global data structures to store our databases. To fix a specific representation for these data yoctoDB manages, let's assume you use a vector array of databases, each database being a hash table of database tables, and each table being a linked-list of rows. Try to keep your design flexible. Prefix the given global variables by "yocto" to avoid namespace collisions.

yoctoInit tries to create a new empty database in the vector array of databases. If it is successful then the value the variable instance points to will be set to the index of this new database in the vector array.

yoctoDestroy is used to delete the database instance making sure to free up any memory allocated for its tables.

Now in the real world one might be trying to use yoctoDB in a threaded environment. We want to avoid situations where weird interleaving of operations from two yoctoDB using threads leave a database in an inconsistent state. To avoid this yoctoDB will support a crude database level locking: In order to use the database at all, you must have the lock for it.

yoctoLock checks if the value of the yoctoDatabase[instance].lock is 0. If it is not 0 the function returns 1. If it is 0, yoctoLock generates a random number less than some MAX_LOCK_NAME. It sets the value of yoctoDatabase[instance].lock to this number and also sets this as the value stored where lock pointer points.

yoctoUnlock checks if the value stored at lock is yoctoDatabase[instance].lock. If it isn't then yoctoUnlock returns with an error. Otherwise, yoctoUnlock sets yoctoDatabase[instance].lock = 0 .

Now in some of the discussion above I glossed over some of the checking that will be needed by the given function. So for yoctoExec I will say things in a little more detail.

yoctoExec checks that instance is less than the yoctoDatabase array size, that yoctoDatabase[instance] is not NULL, that yoctoDatabase[instance].lock is not 0 and that yoctoDatabase[instance].lock is equal to the value to which lock points. If any one of these conditions fails, then the function returns -1. Otherwise, the command pointed to by cmd is executed on yoctoDatabase[instance]. The number of bytes needed for the response that YoctoDB gives is set as the value to which responseSize points.

yoctoResult performs the same checks on instance and lock that yoctoExec does. If these succeed, then yoctoResult returns the first numByte many characters of the response from the last command issued by yoctoExec.

Probably, the easiest way to create the command line interface is to first write the C API above and then use it to build the command line program. We assume that the command line program when run sets up a blank database using yoctoInit, and then the user issues commands on this database.

There are two possible command types that can be sent to yoctoExec: control instructions and assignment instructions. A ';' is used to indicate the end of either kind of command. If the yoctoExec interpreter begins parsing a command but gets to the end of cmd without seeing a semicolon to terminate it then yoctoExec should return an error. For all commands whitespace is ignored.

Here are the possible control instructions and their function:

\import filename; -- reads the file filename executing each instruction in it in turn on the current database.

\export filename; -- writes to the file filename commands to create each of the tables in the current database.

\list; -- prints a list of the names of all the tables in the current database.

\rows tablename; -- prints tablename newline, then prints out each row contained in tablename. By convention, we assume the first row of the table contains column headings for each of the columns of the table.

\forget tablename; -- deletes tablename from the current database.

\quit; -- causes yoctoExec to return the value 1. If the command line interface program receives this value it knows it should terminate.

Assignment instruction make use of table names, strings, numbers, and variables. A table name is a dollar sign followed by 1 or more alphanumeric characters. For example, $0, $person, $employee122, etc. A string begins with a double quote and continue until the next un-escaped double quote. An escaped double quote looks like \". A number consists of 1 or more of the digits 0-9. A variable consists of a capital letter followed by an optional alphanumeric string. For example, X, HiThere, Bo2b.

An assignment instruction has a left hand side followed by "=" followed by a right hand side. The left hand side of an assignment, ignoring whitespace, is always a table name. If this table name already exists, then the instruction fails. The semantics of an assignment instruction is always to create a new table of the left hand side table name using the contents obtained from evaluating the right hand side of the assignment. There are four legal right hand side types:

  1. A list of n-tuples. Here an n-tuple consists of an open parenthesis followed by n comma separated strings, followed by a close parenthesis. An example of an assignment using this case is:
    $employee = ("First Name", "Last Name", "Age") ("Chris", "Pollett", "21+");
    
  2. A table name followed by an operation followed by another table name. An operation can be one of -, +, *. "-" means return all the rows in the first table that are not in the second table, "+" means return the union of the rows of the two tables, "*" means table the cartesian product of the rows in the two tables. An example of an assignment using this case is:
    $mytable = $table1 + $table2;
    
  3. A table name followed by an open square bracket followed by a comma separated list of numbers followed by a close square bracket. This purpose of this operation is to allow one to create a new table by projecting columns out of an old table. To calculate this right hand side one cycles through each row of table name, outputting a new row whose first column is the column from the original row given by the first number in the list between square bracket; whose second column is the column from the original row given by the second number that appeared between square brackets, and so on. For example, after the instructions
    $employee = ("First Name", "Age", "Last Name") ("Chris", "21+", "Pollett");
    $fullName = $employee[3, 1];
    
    $fullname would be the table ("Last Name", "First Name")("Pollett", "Chris").
  4. A table name followed by an open parenthesis followed by a comma separated list of variables or strings followed by a close parenthesis. The purpose of this operation is to select rows from the table that match the pattern of variables and strings. To calculate this, one cycles through each row of table name. For a given row r, one checks each column. If in the right hand side expression that column was a string s, we check if that column in r also has value s. If not, we skip to the next row. If in the right hand side expression that column was a variable, then if we haven't bound that variable we assign it the value (for the rest of the row only) the value given by that column in r. If we have bound that variable we check if the value matches the value of that column in r. If not, we skip to the next row. If, on the other hand, all columns in r match the right hand side expression pattern, then r is put into the output. For example, after:
    $setup = 
    ("a","b",c","d", "e") ("a","b","d","a","a") ("a","b","b","a","a") 
    ("a","b","b","a","b") ("c","d","d","c","a");
    
    $test = $setup(X, Y, Y, X, "a"); 
    
    $test would be the table ("a","b","b","a","a") ("c","d","d","c","a").

This completes the description of the YoctoDB system. You should use either Lex or Flex to recognize tokens. You should use either Yacc or Bison or write a recursive descent parser to handle parsing. In either case, you should create a file grammar.txt containing a grammar that could be used to recognize legal YoctoDB commands. The checking that all tuples in a list of n-tuple assignment are in fact n-tuples can't be done only using your grammar, your parse has to verify this itself, so you don't worry about this in the CFG in grammar.txt.

Point Breakdown

Departmental C++ coding guidelines for your C program followed 1pt
grammar.txt is as described 1pt
Makefile has targets as described 1pt
yoctodb.h as described and apitest, tests each of these functions. 2pts
From the yoctoDB command line, the grader was able to verify each of the 6 control instructions and 4 types of assignment intructions worked (1/2 point each). 5pts
Total10pts